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We have investigated decorrelation of samples in Quantum Monte Carlo (QMC) ground-state 
energy calculations for large Li and H2O nanoclusters. Binning data as a way of eliminating sta- 
tistical correlations, as is the common practice, is found to become increasingly impractical as the 
system size grows. We demonstrate nevertheless that it is possible to perform accurate energy 
calculations— without decorrelating samples— by exploiting the scaling of the integrated autocorre- 
lation time r as a function of the number of electrons in the system. 
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Quantum Monte Carlo (QMC) methods are becom- 
ing increasingly important in electronic structure calcu- 
lations for a variety of systems of current interest [l|, Q ■ 
Although effective QMC methods have been developed 
in recent years to optimize the parameters of many-body 
wavefunctions 0, 0, H, 0, 0, S] , QMC calculations are fun- 
damentally limited by the statistical accuracy that can 
be obtained in a reasonable amount of computation time. 
Despite the growingmaturity of core QMC techniques in 
terms of efficiency , computations of large clusters 

are hampered by increasing correlations in samples with 
increasing system size. In this connection, scaling of the 
computation time required to obtain a given statistical 
accuracy has been reported as a function of the atomic 
number Z 11, 12, 1^, and also as a function of the num- 
ber of atoms M in non- metallic clusters lj|. Here we 
discuss the effects of correlated sampling in Li and H2O 
clusters. 

When the number of atoms in a metallic cluster in- 
creases, the gap to the excited states shrinks as the elec- 
tronic states begin to form a band and approach the 
bulk metallic character. This shrinking gap between the 
ground state and the excited states induces progressively 
greater statistical correlations among samples due to the 
Markovian character of the QMC sampling algorithm 
[lit • We address this problem through an analysis of 
the integrated autocorrelation time r [16| which has not 
been considered in previous work on scaling; in partic- 
ular, Williamson, et al. 14| do not consider this cor- 



relation factor in their studies of non-metallic clusters. 
We demonstrate with the example of Li clusters that the 
problem of correlations among samples increases rapidly 
as the system size grows, making it increasingly difficult 
to decorrelate samples and determine the statistical error 
of the QMC calculation. Nevertheless, by invoking the 
scaling relationship of r, efficient computations of large 
clusters become feasible, even in the presence of highly 
correlated samples. 

The relevant technical details of our calculations are as 
follows. For a given many-body wave function, $0(2;), we 
employ Umrigar's modified version of discrete Langevin 



dynamics [lO| to generate a sequence of QMC samples 
[l7|. Our QMC runs were carried out using a modifica- 
tion of the QMcBeaver code [11]. We chose to move all 
electrons during each QMC step rather than performing 
single-electron moves [l9j. In all cases, an acceptance 
ratio of 50% was maintained. We note that achieving 
equilibration for large clusters is quite demanding. In 
particular, the Li64 and (H20)2n clusters each required 
about one week to equilibrate [20|. 

After equilibration, the QMC walker's steps follow a 
probability distribution tt{x) — |$o(a^)P- The estimate 
of the total energy E—< ^q\H\^o > is obtained from the 
average of the local energies h{x) = H^o{x)/^o{x) over 
the steps taken by the walker. Estimates of E from mul- 
tiple independent calculations are distributed according 
to a normal distribution, with variance estimated by 

M 



a2«a2(l + 2^a(z)). 



(1) 



where CTq is the estimate of the variance of E calculated as 
though the samples were uncorrelated, i.e. — ^raw/-^j 
where cr^^^, is the variance of h{x) over N steps. Ch(i), 
the autocorrelation of ft. at i steps, is given by Ch (*) = 
(h {x^^ h [x^^^^^ — (h{x^^^ (normalized to the point 
i = 0), where j is an arbitrary step number in the post- 
equilibration phase, and the brackets (. . .) denote the 
expectation value over an infinite ensemble. The "inte- 
grated autocorrelation time" r is then defined such that 
the variance 



(2) 



so that the "effective" number of steps taken by the 
walker is given not by N , but by N/t. Eq. [2] gener- 
alizes the statistical error whose scaling with system size 
has been discussed by Williamson, et al. [1,4:] by the ad- 
dition of the quantity t, the factor which describes the 
effect of statistical correlations between QMC moves. In 
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FIG. 1: (Color online) Normalized trial standard deviation 
(Tt /(To and its error bar for 4 to 64 atom Li clusters. Samples 
are decorrelated when at /(Jo displays a plateau, where the 
height of the plateau is related from Eq. |4]to the square root 
of the integrated correlation time, i.e. = u/cro. Inset 

shows the linear scaling of with the number of electrons 
Nei in the system. Error bars on data points are comparable 
to dot sizes. 



the discrete case at liand, we have 

oo 

T=(l + 2^C,.(i)) . (3) 

1=1 

By comparing Eqs. ([T]) and ([3]), we see that 



The ratio in Eq. [3] is sometimes also called the "statis- 
tical inefficiency"; see Refs. 2l|,|2^|23j. In order to de- 



termine the ratio [a /oqY and thus obtain r, we employed 
the recently developed Dynamic Distributable Decorre- 
lation Algorithm (DDDA) ^ to block data efficiently. 
Like other data blocking algorithms 25|, the DDDA is 
a renormalization group method that relies on the ap- 
pearance of a plateau to determine the correct variance 
of the mean value of the energy. The DDDA performs 
the blocking with nearly zero overhead both in time and 
in memory, and for these reasons it is well-suited for our 
study of the convergence properties of large systems. 

Simulations were carried out on Li clusters with 4, 8, 
20, 26 and 64 atoms [2^. Li is an interesting element 
because it displays subtle bonding properties 27, [il ]. 
Moreover, it contains only a few electrons per atom, 
which makes it possible to consider relatively large Li 
clusters. We have used Hartree-Fock [i^ many-body 
wave functions ^o{x) constructed with one-particle or- 
bitals obtained with the software Jaguar [30]. 



Fig. [T] shows the value at/cro plotted as a function 
of block size for Li clusters varying in size from 4-64 
atoms, where Cf is the trial standard deviation at var- 
ious block sizes in the computation. The horizontal axis 
gives the number of adjacent QMC samples (block size) 
pulled piecewise from the full set of samples and aggre- 
gated, creating effective samples which are used in the 
calculation of at- The appearance of a plateau, or flat 
region, in the CTf/ao plot indicates full decorrelation of 
samples. The height of this plateau yields the true value 
of the normalized standard deviation ct/ctoj which from 
Eq. |4] gives the correct value of the square root of the 
integrated correlation time, -Jr. For example, for the 
4 atom cluster with Ne=\2 electrons, the at/cTQ curve 
(black line) flattens to a value of about 2, which is the 
value shown for y/r in the inset. Similarly, the plateau 
for 20 Li atoms with 60 electrons (deep blue line) appears 
for (Tt/(To=2.7. 

An increase in the height of the plateau, i.e., in the 
value of cr/(To or equivalently of with cluster size is 
evident in Fig. [T] More quantitatively, the inset reveals a 
linear relationship of u/cto with the number of electrons 
Nei in the cluster. It thus follows that the autocorre- 
lation time diverges quadratically with system size. We 
emphasize that the total number of points rip needed to 
reach the plateau for a given cluster size is much larger 
than the corresponding value of r. For example, for the 
64 atom cluster the plateau appears in Fig. [T]at the value 
14 or equivalently for Up = 2^'', while r = (4.3)^ = 18.5. 
In other words, one would need to skip 2^^ steps in this 
case if one wanted to decorrelate samples. Note that r 
is an integrated correlation time; therefore, skipping r 
steps will not produce decorrelated samples. Clearly, be- 
(4) cause Up grows so large, calculation of the error using 
reblocking rapidly becomes intractable for large clusters. 

The total time of a computation can be written as 
T = tgN, where ts is the time per step and N is the 
total number of steps. From Eq. [21 we see that in order 
to maintain a given accuracy a for QMC calculations of 
different clusters, the number of steps N must change to 
balance changes in cr^^j^ and r. For quadratically scal- 
ing r, the number of steps N, and therefore the total 
computation time, will have an additional scaling factor 
of up to M^, where M denotes the number of atoms in 
the cluster, in order to compensate for the scaling of r. 
This additional scaling factor for r is present whether the 
quantity being calculated is the energy for the entire sys- 
tem, or the energy per atom. The scaling factors due to 



c'^aw aii'i have been considered in previous work 14 1 



and are not considered in this study. 

A QMC calculation is complete only when accompa- 
nied by a measure a of its statistical uncertainty, which is 
given by the onset of the decorrelation plateau in a plot 
such as that of Fig. [TJ The failure to obtain this onset in- 
dicates that it is impossible to decorrelate samples during 
the run. This "stickiness" of correlated samples in QMC 



runs impedes our ability to determine the statistical error 
a of the computation. Often, in order to decorrelate sam- 
ples, the problem of stickiness is overcome by obtaining 
a large number of sample points and discarding inter- 
mediate correlated points. However, as we have already 
pointed out above, it becomes increasingly difficult in 
large clusters to obtain the plateau and decorrelate sam- 
ples. It may in fact be impossible to decorrelate even a 
single sample during the run. In sharp contrast, our pro- 
cedure requires that all steps are sampled, with no steps 
discarded, in order to first calculate CTq, and then obtain 
cr^ by rescaling with an extrapolated value of r. In this 
way, our scheme provides a route for circumventing the 
problem of decorrelating samples in large clusters. 

As an example of an energy calculation obtained with- 
out observing the onset of a decorrelation plateau, we 
were able to obtain the energy (per atom) for the Lig4 
cluster to within a = 0.1 eV (±10%) using only 10^ 
QMC steps (sij. We have verified our estimate of the 
accuracy of a for this energy calculation by performing 
multiple, independently equilibrated runs and observing 
the spread of the energy results. 

We turn now to consider II2O clusters, where we have 
studied clusters of 2, 4, 8, 9, 14 and 20 molecules [s^ . 
These clusters have similar electron counts to those of the 
Li clusters discussed above. Fig. [2] shows decorrelation 
plots as a function of block size. All clusters display the 
onset of the decorrelation plateau. A comparison of the 
insets in Figs. [Hand [2] shows that the value of ^/t = (t/cto 
varies significantly less rapidly with system size in H2O 
clusters. It is striking that all curves in Fig. [2] collapse 
onto an essentially universal curve up to five block trans- 
formations. Beyond this block size, however, it appears 
that other secondary longer-distance correlations play a 
noticeable role so that the curves begin to rise again 
and settle into new, weakly size-dependent plateaux. If 
it were not for this secondary feature, all II2O clusters 
would achieve their plateaux in the same small number of 
blocking steps, and the stickiness induced by correlated 
sampling would not be important. Because of these sec- 
ondary features, II2O clusters exhibit quadratic scaling 
of T with M, but the influence is significantly less severe 
than for Li. In fact, if a -I- hN^i describes a linear fit for 
the data in the inset for II2O, we see that for the cluster 
sizes we studied, the constant term in the quadratic ex- 
pression T = (v^) = (a + hNf-if dominates. This may 
explain why the phenomenon was not observed in Ref. 
[14j . For Li, the linear term dominates and adds to the 
scaling of the total computation time for the cluster sizes 
we studied. 

We now provide a possible explanation for the diver- 
gence of the autocorrelation time r in ground state com- 
putations of Li clusters. We recall that bulk Li metal 
contains discontinuities in the momentum density of the 
electron gas at the Fermi momentum. Such a sharp cut- 
off in the momentum distribution can be expected to 
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FIG. 2: (Color online) Same as Fig. [T] except that this figure 
refers to H2O clusters. 



generally cause oscillations of electron configurations in- 
volved in the QMC algorithm and thereby slow the sta- 
tistical convergence in clusters like those of Li atoms, 
which must eventually approach the bulk metallic state 
with increasing size. These considerations suggest that 
smearing or broadening the underlying "Fermi surface" 
(formally, of course, there is no Fermi surface in a finite 
system) in metallic nanoclusters using a trial wavefunc- 
tion such as an AGP 33] or a Pfaflian [34] might result in 
shorter autocorrelation times. Notably, Baroni and Mo- 
roni [35^ have discussed a theoretical connection between 
the autocorrelation time and the spectrum of an auxiliary 
quantum Hamiltonian, which might help explain the di- 
vergence of the autocorrelation time in metallic systems. 

The integrated autocorrelation time r is related to the 
second largest eigenvalue A of the transition matrix by 
T — 1 + 2A/(1 — A) [1^. Very recent work regarding 
the scaling of A in the Metropolis alorithm demonstrates 
that for log-concave distributions l<I>o(a::)l^, the scahng of 
T with M is at most [s^. For Langevin dynam- 
ics the divergence of r is usually less severe than in the 
Metropolis algorithm 37|. In principle, the appearance 
of nodes in the wavefunction can violate the log-concavity 
of ]$o(a;)l^ and might modify the scaling properties of r. 
Although our wavefunctions contain nodes, the fact that 
we obtain a clear scaling of r suggests that the 
scaling remains maximal for typical QMC distributions 
l$o(a^)l'^ more generally [38]. 

In conclusion, we have investigated the scaling of the 
integrated autocorrelation time r in QMC computations 
of Li and H2O clusters. The quantity t, which directly 
yields a measure of sampling correlations in the calcula- 
tion, is found to diverge quadratically with system size 
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M in both cases, more severely for Li than for H2O, and 
becomes increasingly difficult to calculate for large clus- 
ters. We have shown that brute force decorrelation of 
samples is simply not feasible for these large clusters. 
Our study highlights the importance of correlated sam- 
pling in QMC computations of large clusters and provides 
a route for obtaining accurate error estimates by exploit- 
ing the scaling properties of r with system size, without 
requiring explicit decorrelation of samples. The present 
results should be relevant for the growing scientific com- 
munity [l6| interested in the convergence properties of 
Markov chains and QMC calculations of properties of 
nanoclusters. Further work towards understanding the 
effects of excitation gaps on the convergence of the QMC 
algorithm might be valuable. 
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